|
Locality-sensitive hashing (LSH) reduces the dimensionality of high-dimensional data. LSH hashes input items so that similar items map to the same “buckets” with high probability (the number of buckets being much smaller than the universe of possible input items). LSH differs from conventional and cryptographic hash functions because it aims to maximize the probability of a “collision” for similar items. Locality-sensitive hashing has much in common with data clustering and nearest neighbor search. ==Definition== An ''LSH family''〔 is defined for a metric space , a threshold and an approximation factor . This family is a family of functions which map elements from the metric space to a bucket . The LSH family satisfies the following conditions for any two points , using a function which is chosen uniformly at random: * if , then (i.e., and collide) with probability at least , * if , then with probability at most . A family is interesting when . Such a family is called ''-sensitive''. Alternatively it is defined with respect to a universe of items that have a similarity function . An LSH scheme is a family of hash functions coupled with a probability distribution over the functions such that a function chosen according to satisfies the property that for any . 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「Locality-sensitive hashing」の詳細全文を読む スポンサード リンク
|